Search for: All records

Creators/Authors contains: "Gagne, David John"

« Prev Next »

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Improving Ensemble Extreme Precipitation Forecasts Using Generative Artificial Intelligence

https://doi.org/10.1175/AIES-D-24-0063.1

Sha, Yingkai; Sobash, Ryan A; Gagne, David John (April 2025, Artificial Intelligence for the Earth Systems)

Abstract An ensemble postprocessing method is developed to improve the probabilistic forecasts of extreme precipitation events across the conterminous United States (CONUS). The method combines a 3D vision transformer (ViT) for bias correction with a latent diffusion model (LDM), a generative artificial intelligence (AI) method, to postprocess 6-hourly precipitation ensemble forecasts and produce an enlarged generative ensemble that contains spatiotemporally consistent precipitation trajectories. These trajectories are expected to improve the characterization of extreme precipitation events and offer skillful multiday accumulated and 6-hourly precipitation guidance. The method is tested using the Global Ensemble Forecast System (GEFS) precipitation forecasts out to day 6 and is verified against the Climatology-Calibrated Precipitation Analysis (CCPA) data. Verification results indicate that the method generated skillful ensemble members with improved continuous ranked probabilistic skill scores (CRPSSs) and Brier skill scores (BSSs) over the raw operational GEFS and a multivariate statistical postprocessing baseline. It showed skillful and reliable probabilities for events at extreme precipitation thresholds. Explainability studies were further conducted, which revealed the decision-making process of the method and confirmed its effectiveness on ensemble member generation. This work introduces a novel, generative AI–based approach to address the limitation of small numerical ensembles and the need for larger ensembles to identify extreme precipitation events. Significance StatementWe use a new artificial intelligence (AI) technique to improve extreme precipitation forecasts from a numerical weather prediction ensemble, generating more scenarios that better characterize extreme precipitation events. This AI-generated ensemble improved the accuracy of precipitation forecasts and probabilistic warnings for extreme precipitation events. The study explores AI methods to generate precipitation forecasts and explains the decision-making mechanisms of such AI techniques to prove their effectiveness.
more » « less
Full Text Available
(Re)Conceptualizing trustworthy AI: A foundation for change

https://doi.org/10.1016/j.artint.2025.104309

Wirz, Christopher D; Demuth, Julie L; Bostrom, Ann; Cains, Mariana G; Ebert-Uphoff, Imme; Gagne, David John; Schumacher, Andrea; McGovern, Amy; Madlambayan, Deianna (May 2025, Artificial Intelligence)

Full Text Available
Improving Medium Range Severe Weather Prediction through Transformer Post-processing of AI Weather Forecasts

https://doi.org/10.1175/AIES-D-25-0045.1

Hua, Zhanxiang; Sobash, Ryan A; Gagne, David John; Sha, Yingkai; Anderson-Frey, Alexandra (November 2025, Artificial Intelligence for the Earth Systems)

Abstract Improving the skill of medium-range (3–8 day) severe weather prediction is crucial for mitigating societal impacts. This study introduces a novel approach leveraging decoder-only transformer networks to post-process AI-based weather forecasts, specifically from the Pangu-Weather model, for improved severe weather guidance. Unlike traditional post-processing methods that use a dense neural network to predict the probability of severe weather using discrete forecast samples, our method treats forecast lead times as sequential “tokens”, enabling the transformer to learn complex temporal relationships within the evolving atmospheric state. We compare this approach against post-processing of the Global Forecast System (GFS) using both a traditional dense neural network and our transformer, as well as configurations that exclude convective parameters to fairly evaluate the impact of using the Pangu-Weather AI model. Results demonstrate that the transformer-based post-processing significantly enhances forecast skill compared to dense neural networks. Furthermore, AI-driven forecasts, particularly Pangu-Weather initialized from high resolution analysis, exhibit superior performance to GFS in the medium-range, even without explicit convective parameters. Our approach offers improved accuracy, and reliability, which also provides interpretability through feature attribution analysis, advancing medium-range severe weather prediction capabilities.
more » « less
Full Text Available
Measuring Sharpness of AI-Generated Meteorological Imagery

https://doi.org/10.1175/AIES-D-24-0083.1

Ebert-Uphoff, Imme; Ver_Hoef, Lander; Schreck, John S; Stock, Jason; Molina, Maria J; McGovern, Amy; Yu, Michael; Petzke, Bill; Hilburn, Kyle; Hall, David M; et al (June 2025, Artificial Intelligence for the Earth Systems)

Abstract AI-based algorithms are emerging in many meteorological applications that produce imagery as output, including for global weather forecasting models. However, the imagery produced by AI algorithms, especially by convolutional neural networks (CNNs), is often described as too blurry to look realistic, partly because CNNs tend to represent uncertainty as blurriness. This blurriness can be undesirable since it might obscure important meteorological features. More complex AI models, such as Generative AI models, produce images that appear to be sharper. However, improved sharpness may come at the expense of a decline in other performance criteria, such as standard forecast verification metrics. To navigate any trade-off between sharpness and other performance metrics it is important to quantitatively assess those other metrics along with sharpness. While there is a rich set of forecast verification metrics available for meteorological images, none of them focus on sharpness. This paper seeks to fill this gap by 1) exploring a variety of sharpness metrics from other fields, 2) evaluating properties of these metrics, 3) proposing the new concept of Gaussian Blur Equivalence as a tool for their uniform interpretation, and 4) demonstrating their use for sample meteorological applications, including a CNN that emulates radar imagery from satellite imagery (GREMLIN) and an AI-based global weather forecasting model (GraphCast).
more » « less
Full Text Available
Ethics in climate AI: From theory to practice

https://doi.org/10.1371/journal.pclm.0000465

Acquaviva, Viviana; Barnes, Elizabeth A; Gagne, David John; McKinley, Galen A; Thais, Savannah (August 2024, PLOS Climate)
Males, Jamie (Ed.)
Full Text Available
Generative Ensemble Deep Learning Severe Weather Prediction from a Deterministic Convection-Allowing Model

https://doi.org/10.1175/AIES-D-23-0094.1

Sha, Yingkai; Sobash, Ryan A; Gagne, David John (April 2024, Artificial Intelligence for the Earth Systems)

Abstract An ensemble postprocessing method is developed for the probabilistic prediction of severe weather (tornadoes, hail, and wind gusts) over the conterminous United States (CONUS). The method combines conditional generative adversarial networks (CGANs), a type of deep generative model, with a convolutional neural network (CNN) to postprocess convection-allowing model (CAM) forecasts. The CGANs are designed to create synthetic ensemble members from deterministic CAM forecasts, and their outputs are processed by the CNN to estimate the probability of severe weather. The method is tested using High-Resolution Rapid Refresh (HRRR) 1–24-h forecasts as inputs and Storm Prediction Center (SPC) severe weather reports as targets. The method produced skillful predictions with up to 20% Brier skill score (BSS) increases compared to other neural-network-based reference methods using a testing dataset of HRRR forecasts in 2021. For the evaluation of uncertainty quantification, the method is overconfident but produces meaningful ensemble spreads that can distinguish good and bad forecasts. The quality of CGAN outputs is also evaluated. Results show that the CGAN outputs behave similarly to a numerical ensemble; they preserved the intervariable correlations and the contribution of influential predictors as in the original HRRR forecasts. This work provides a novel approach to postprocess CAM output using neural networks that can be applied to severe weather prediction. Significance StatementWe use a new machine learning (ML) technique to generate probabilistic forecasts of convective weather hazards, such as tornadoes and hailstorms, with the output from high-resolution numerical weather model forecasts. The new ML system generates an ensemble of synthetic forecast fields from a single forecast, which are then used to train ML models for convective hazard prediction. Using this ML-generated ensemble for training leads to improvements of 10%–20% in severe weather forecast skills compared to using other ML algorithms that use only output from the single forecast. This work is unique in that it explores the use of ML methods for producing synthetic forecasts of convective storm events and using these to train ML systems for high-impact convective weather prediction.
more » « less
Full Text Available
Physically Explainable Deep Learning for Convective Initiation Nowcasting Using GOES-16 Satellite Observations

https://doi.org/10.1175/AIES-D-23-0098.1

Fan, Da; Greybush, Steven J; Clothiaux, Eugene E; Gagne, David John (May 2024, Artificial Intelligence for the Earth Systems)

Abstract Convective initiation (CI) nowcasting remains a challenging problem for both numerical weather prediction models and existing nowcasting algorithms. In this study, an object-based probabilistic deep learning model is developed to predict CI based on multichannel infrared GOES-16 satellite observations. The data come from patches surrounding potential CI events identified in Multi-Radar Multi-Sensor Doppler weather radar products over the Great Plains region from June and July 2020 and June 2021. An objective radar-based approach is used to identify these events. The deep learning model significantly outperforms the classical logistic model at lead times up to 1 hour, especially on the false alarm ratio. Through case studies, the deep learning model exhibits dependence on the characteristics of clouds and moisture at multiple altitudes. Model explanation further reveals that the contribution of features to model predictions is significantly dependent on the baseline, a reference point against which the prediction is compared. Under a moist baseline, moisture gradients in the lower and middle troposphere contribute most to correct CI forecasts. In contrast, under clear-sky baselines, correct CI forecasts are dominated by cloud-top features, including cloud-top glaciation, height, and cloud coverage. Our study demonstrates the advantage of using different baselines in further understanding model behavior and gaining scientific insights.
more » « less
Full Text Available
Exploring NWS Forecasters’ Assessment of AI Guidance Trustworthiness

https://doi.org/10.1175/WAF-D-23-0180.1

Cains, Mariana G; Wirz, Christopher D; Demuth, Julie L; Bostrom, Ann; Gagne, David John; McGovern, Amy; Sobash, Ryan A; Madlambayan, Deianna (August 2024, Weather and Forecasting)

Abstract As artificial intelligence (AI) methods are increasingly used to develop new guidance intended for operational use by forecasters, it is critical to evaluate whether forecasters deem the guidance trustworthy. Past trust-related AI research suggests that certain attributes (e.g., understanding how the AI was trained, interactivity, and performance) contribute to users perceiving the AI as trustworthy. However, little research has been done to examine the role of these and other attributes for weather forecasters. In this study, we conducted 16 online interviews with National Weather Service (NWS) forecasters to examine (i) how they make guidance use decisions and (ii) how the AI model technique used, training, input variables, performance, and developers as well as interacting with the model output influenced their assessments of trustworthiness of new guidance. The interviews pertained to either a random forest model predicting the probability of severe hail or a 2D convolutional neural network model predicting the probability of storm mode. When taken as a whole, our findings illustrate how forecasters’ assessment of AI guidance trustworthiness is a process that occurs over time rather than automatically or at first introduction. We recommend developers center end users when creating new AI guidance tools, making end users integral to their thinking and efforts. This approach is essential for the development of useful andusedtools. The details of these findings can help AI developers understand how forecasters perceive AI guidance and inform AI development and refinement efforts. Significance StatementWe used a mixed-methods quantitative and qualitative approach to understand how National Weather Service (NWS) forecasters 1) make guidance use decisions within their operational forecasting process and 2) assess the trustworthiness of prototype guidance developed using artificial intelligence (AI). When taken as a whole, our findings illustrate that forecasters’ assessment of AI guidance trustworthiness is a process that occurs over time rather than automatically and suggest that developers must center the end user when creating new AI guidance tools to ensure that the developed tools are useful andused.
more » « less
Full Text Available
Identifying and Categorizing Bias in AI/ML for Earth Sciences

https://doi.org/10.1175/BAMS-D-23-0196.1

McGovern, Amy; Bostrom, Ann; McGraw, Marie; Chase, Randy J; Gagne, David John; Ebert-Uphoff, Imme; Musgrave, Kate D; Schumacher, Andrea (March 2024, Bulletin of the American Meteorological Society)

Abstract Artificial intelligence (AI) can be used to improve performance across a wide range of Earth system prediction tasks. As with any application of AI, it is important for AI to be developed in an ethical and responsible manner to minimize bias and other effects. In this work, we extend our previous work demonstrating how AI can go wrong with weather and climate applications by presenting a categorization of bias for AI in the Earth sciences. This categorization can assist AI developers to identify potential biases that can affect their model throughout the AI development life cycle. We highlight examples from a variety of Earth system prediction tasks of each category of bias.
more » « less
Full Text Available
Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation for Earth System Science Applications

https://doi.org/10.1175/AIES-D-23-0093.1

Schreck, John S; Gagne, David John; Becker, Charlie; Chapman, William E; Elmore, Kim; Fan, Da; Gantos, Gabrielle; Kim, Eliot; Kimpara, Dhamma; Martin, Thomas; et al (October 2024, Artificial Intelligence for the Earth Systems)

Abstract Robust quantification of predictive uncertainty is a critical addition needed for machine learning applied to weather and climate problems to improve the understanding of what is driving prediction sensitivity. Ensembles of machine learning models provide predictive uncertainty estimates in a conceptually simple way but require multiple models for training and prediction, increasing computational cost and latency. Parametric deep learning can estimate uncertainty with one model by predicting the parameters of a probability distribution but does not account for epistemic uncertainty. Evidential deep learning, a technique that extends parametric deep learning to higher-order distributions, can account for both aleatoric and epistemic uncertainties with one model. This study compares the uncertainty derived from evidential neural networks to that obtained from ensembles. Through applications of the classification of winter precipitation type and regression of surface-layer fluxes, we show evidential deep learning models attaining predictive accuracy rivaling standard methods while robustly quantifying both sources of uncertainty. We evaluate the uncertainty in terms of how well the predictions are calibrated and how well the uncertainty correlates with prediction error. Analyses of uncertainty in the context of the inputs reveal sensitivities to underlying meteorological processes, facilitating interpretation of the models. The conceptual simplicity, interpretability, and computational efficiency of evidential neural networks make them highly extensible, offering a promising approach for reliable and practical uncertainty quantification in Earth system science modeling. To encourage broader adoption of evidential deep learning, we have developed a new Python package, Machine Integration and Learning for Earth Systems (MILES) group Generalized Uncertainty for Earth System Science (GUESS) (MILES-GUESS) (https://github.com/ai2es/miles-guess), that enables users to train and evaluate both evidential and ensemble deep learning. Significance StatementThis study demonstrates a new technique, evidential deep learning, for robust and computationally efficient uncertainty quantification in modeling the Earth system. The method integrates probabilistic principles into deep neural networks, enabling the estimation of both aleatoric uncertainty from noisy data and epistemic uncertainty from model limitations using a single model. Our analyses reveal how decomposing these uncertainties provides valuable insights into reliability, accuracy, and model shortcomings. We show that the approach can rival standard methods in classification and regression tasks within atmospheric science while offering practical advantages such as computational efficiency. With further advances, evidential networks have the potential to enhance risk assessment and decision-making across meteorology by improving uncertainty quantification, a longstanding challenge. This work establishes a strong foundation and motivation for the broader adoption of evidential learning, where properly quantifying uncertainties is critical yet lacking.
more » « less
Full Text Available

« Prev Next »